WIMP astronomy and particle physics with liquid-noble and cryogenic 

direct-detection experiments 
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Once weakly-interacting massive particles (WIMPs) are unambiguously detected in direct- 
detection experiments, the challenge will be to determine what one may infer from the data. Here, 
I examine the prospects for reconstructing the local speed distribution of WIMPs in addition to 
WIMP particle-physics properties (mass, cross sections) from next-generation cryogenic and liquid- 
noble direct-detection experiments. I find that the common method of fixing the form of the velocity 
distribution when estimating constraints on WIMP mass and cross sections means losing out on the 
information on the speed distribution contained in the data and may lead to biases in the inferred 
values of the particle-physics parameters. I show that using a more general, empirical form of the 
speed distribution can lead to good constraints on the speed distribution. Moreover, one can use 
Bayesian model-selection criteria to determine if a theoretically-inspired functional form for the 
speed distribution (such as a Maxwell-Boltzmann distribution) fits better than an empirical model. 
The shape of the degeneracy between WIMP mass and cross sections and their offset from the true 
values of those parameters depends on the hypothesis for the speed distribution, which has signif- 
icant implications for consistency checks between direct-detection and collider data. In addition, I 
find that the uncertainties on theoretical parameters depends sensitively on the upper end of the 
energy range used for WIMP searches. Better constraints on the WIMP particle-physics parameters 
and speed distribution are obtained if the WIMP search is extended to higher energy (~ 1 MeV). 

PACS numbers: 07.05.Kf,14.80.-j,95.35.+d,98.62.Gq 
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I. INTRODUCTION 

Dark matter makes up ~ 23% of the energy density 
of the observable Universe, yet its identity is unknown 
(e.g., [![). While there are a number of well-motivated 
particle-physics candidates for dark matter (e.g., @t3|)) 
the most popular particle class is the weakly-interacting 
massive particle (WIMP) @. This class of dark-matter 
candidate is popular because a number of particles in this 
class arise "for free" and at the ri ght relic abundance in 
extensions to the standard model Moreover, due to 
their weak but non-negligible coupling to standard-model 
particles, it is possible to detect them. Candidates in 
this class include the supersy mmetric neutralino and the 
Kaluza-Klein photon [10Hl3l | . 

There is a wide variety of efforts focused on finding 
and characterizing WIMP dark matter, which can be 
broadly classified as creating (in colliders), destroying 
(by annihilation in dark-matter-dense astrophysical ob- 
jects), or colliding with WIMPs (using nuclei in terres- 
trial detectors) [ijj fl4H20lj. This last method, called 
"direct detection", is the subject of this work. There 
is a broad ongoing effort to find and identify WIMPs 
using direct-detection experiments. Currently, only the 
DAMA/LIBRA collaboration claims a direct detection of 
dark matter [2l[, a controversial clai m g iven the nonde- 
tections from other experiments [22h24| . Experimental 
efforts can be roughly divided between those focused on 
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detecting WIMPs through their spin-dependent (axial- 
vector) couplings to nuclei and those focusing on spin- 
independent scattering on nuclei. 

The most mature technologies are those associated 
with searches for spin-independent (SI) WIMP-nucleon 
scatters. Cryogenic experiments such as CDMS, Edel- 
weiss II, CRESST, and CoGeNT can distinguish nuclear 
from electronic recoils using different (ionization, scintil- 
lation, and heat) signals [251 - 430 ] . Liquid- noble gas ex- 
periments such as XENON100, LUX, XMASS, WArP, 
ArDM, DEAP/CLEAN, DarkSide, and Zeplin-III can 
distinguish between the two types of recoils using a 
combination of the amount of scintillation light, ioniza- 
tion yield, pulse shape, and timing (31rl36| . These ex- 
periments can resolve the energies but not directions 
of the recoils. The current best limits on the spin- 
independent WIMP-proton cross section (<7p ) arise from 
using < 1000 kg • day of data, and at are the level of 
a s P 1 < 4 x 10~ 44 cm 2 for a WIMP mass m x « 50 GeV. 
The targets for these experiments are increasing rapidly, 
with ^ton-scale liquid-noble and ~ 100 kg cryogenic ex- 
periments expected to be operational within the next five 
years (in or around year 2015) [HI, [H, H3] ■ Experiments 
an order of magnitude bigger than those are being dis- 
cussed, to be constructed approximately ten years from 
now [HO, [M 11 HI. Those 2020- to 2025-era experi- 
ments should have WIMP sensitivities 4 to 5 orders of 
magnitude better than those today. 

The question is, if these next-generation direct- 
detection experiments see unambiguous WIMP signals, 
what will we learn about WIMPs from them? Most of the 
effort thus far has been focused on determining how well 
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one may infer the WIMP mass and cross sections. These 
are fundamental particle-physics WIMP parameters that 
will allow us, in combination with indirect detection and 
production at colliders, to determine to which extension 
to the standard model the WIMP belongs. However, 
the energy spectrum of events in direct-detection exper- 
iments depends not only on the WIMP mass and cross 
sections, but also on the dark-matter distribution func- 
tion (DF). Thus, any inference of the WIMP mass and 
cross sections from the data also depends on the DF (see 
Eq. [Din Sec. HTJ . The WIMP DF is typically modeled 
with a fixed theoretically-inspired form (e.g., an isotropic 
Maxwcll-Boltzmann distribution or direct fits to N-body 
simulations) in which the parameters of the model (e.g., 
the one-dimensional velocity dispersion v xms ) are either 
fixed or only allowed to vary in a narrow range pol - l47l |. 
Implicit in this treatment of the WIMP DF is that it 
is well described by a globally smooth dark-matter halo 
model. 

However, the actual local dark-matter DF is unknown. 
Even if the local dark-matter DF is dominated by a 
halo component, we do not know exactly what to ex- 
pect. High- resolution dark- matter-only N-body simula- 
tions indicate significant halo-to-halo variation in the DF 
of the smooth component of the halo as well as ~ kpc 
scale fluctuations in the DF that are dynamically cold 
imprints of the halo accretion history [HI, |49[ . The ve- 
locity distribution is typically anisotropic. In addition to 
a smooth halo component to the local DF, there could 
also be significant contributions from a dark disk [5(il - [54j 
or small-scale velocity streams (below the resolution limit 
of simulations) that have not yet phase mixed. 

Direct-detection experiments and neutrino searches for 
WIMP annihilation in the Sun and Earth are the only 
probes of the local DF of WIMPs, unless there is a sig- 
nificant velocity-dependence in the annihilation cross sec- 
tion. While there have recently been some attempts to 
constrain the WIMP mass and cross sections by "inte- 
grating out" the uncertainty in the WIMP velocity dis- 
tribution [H^, [56| , it is highly desirable to use the direct- 
detection data to understand the WIMP DF as well as 
the particle-physics properties of dark matter. 

In this work, I explore the prospects for determining 
the WIMP speed distribution (the integral of the DF over 
configuration-space volume and velocity orientation) for 
several benchmark points in m x — er^space and velocity 
distribution models from 2015-era cryogenic and liquid- 
noble direct-detection experiments. Using a Bayesian 
framework to analyze mock data sets, I show that one 
may infer the WIMP speed distribution as well as the 
WIMP mass and cross section from even a modest num- 
ber of events, assuming that WIMP events are identified 
in at least one 2015-era direct-detection experiment. 

I consider several scenarios. First, I show how well 
one may characterize the speed distribution as well as 
the WIMP particle-physics parameters if the hypothe- 
sis for the speed distribution matches the data, but for 
which the parameter values of the hypothesis are pre- 



viously unknown. Since parameter constraints are most 
accurate and unbiased if the hypothesis is correct, this is 
a demonstration of the best constraints we can get from 
the data. Second, I consider the case in which the hy- 
pothesis for the speed distribution is wrong, as would be 
the case if the local WIMP population had dark-disk and 
stream components in addition to a smooth halo compo- 
nent, but one were to analyze data with the hypothesis 
that only a single velocity component exists. Finally, I 
show the constraints one obtains on the WIMP mass, 
cross sections, and speed distribution with the hypoth- 
esis of a simple empirical form for the speed distribu- 
tion. This is a proof of principle of the usefulness of 
empirical speed-distribution models for parameter esti- 
mation and Bayesian model selection. While this is not 
the first exploration of empirical treatments of the WIMP 
speed distribution [57H60J, the unbinned likelihood and 
the Bayesian framework I employ below have the advan- 
tage of being easily modified to incorporate backgrounds, 
systematic errors, and additional data sets of various 
types (not limited to direct detection). In addition, this 
work highlights the importance of the hypothesis for the 
form of the WIMP speed distribution in inferring WIMP 
particle-physics parameters from direct-detection data. 

The outline of this paper is as follows. In Sec. HH I 
describe the ansatze and methods used to infer WIMP 
properties and the speed distribution from mock data 
sets. In Sec. IIII1 I apply the methods in Sec. |H] to 
mock data sets for a set of WIMP particle-physics and 
speed-distribution benchmarks. In Sec. IIV1 I discuss the 
implications of the results of Sec. IIIII for estimating the 
local WIMP speed distribution in the future, and discuss 
the results in the context of WIMP characterization us- 
ing a combination of data sets, including those from the 
Large Hadron Collider. The key points of this work are 
summarized in Sec. [Vj 



II. ANSATZ & METHOD 

The plan is to estimate how well one may reconstruct 
the WIMP speed distribution as well as the particle- 
physics properties of WIMPs (mass, cross sections) in 
2015-era liquid- noble and cryogenic direct-detection ex- 
periments. These experiments can resolve the energy of 
WIMP-induced nuclear recoils but not the direction of 
the recoiling nucleus. 

In the absence of energy errors, the differential event 
rate per kilogram of a target N with nuclear mass mjv 
in an direct-detection experiment is 

dR ( m N \ X f 3 da N m 

where da^/dQ is the differential scattering cross section, 
/(x, v) is the local dark-matter DF, and 

v min = (tojvQ/2mw) 1/2 ( 2 ) 
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is the minimum speed required for a particle of mass 
m x to deposit energy Q to the nucleus if the interaction 
is elastic, and //jv is the WIMP-nucleus reduced mass. 
In this work, I assume that the interactions are elastic, 
deferring the discussion of inelastic interactions to future 
work [6l|. 

If one neglects the annual modulation of the direct- 
detection signal (due to the Earth's motion relative to 
the WIMP velocity distribution), the integral over the 
WIMP directions is time independent, and one can thus 
consider the speed distribution of WIMPs rather than the 
full velocity distribution. The speed distribution g(v) is 
defined such that 

g(v) = J dO„/(x, v)/n x (3) 

Jg(v)v 2 dv = 1, (4) 

where n x — p x /m x is the number density of dark-matter 
particles. Implicit in Eq. ([3]) is the assumption that the 
number density is constant throughout the duration of 
the experiment, i.e., that the number density does not 
vary significantly along the Earth's path through the So- 
lar System. Annual modulation provides an interesting 
constraint on the full velocity distribution, not just the 
speed distribution, but I will defer a discussion of this to 
future work. 

I create mock direct-detection data sets using a vari- 
ety of particle-physics (Sec. Ill A[) and speed-distribution 
benchmarks (Sec. IIIBI) for a set of toy-model 2015-era 
experiments (Sec. Ill C[) . I estimate particle-physics and 
speed-distribution parameters from the mock data sets 
using the likelihood and sampling techniques described 
in Sec. iHDl 

A. Particle physics 

For the time being, I assume that the spin-dependent 
(SD) WIMP-proton cross section <Jp D = 0, and that all 
events result from spin-independent elastic scattering. 
The scattering cross section for Eq. ([1} for a target nu- 
cleus with atomic number A is thus (e.g., Ref. [Tljj ) 

^ - ^^FlAQ), (5) 

where uia is the nuclear mass, /j, p is the reduced mass 
of the WIMP-proton system, and Fsi(Q) is the nuclear 
form factor. I assume that the coupling of WIMPs to 
protons is identical to the WIMP-neutron coupling, and 
use a Helm form factor for F$i [62j . 

B. Astrophysics 

As benchmark models for the mock experiments, I take 
one or more isotropic Maxwell-Boltzmann distributions, 



which in a frame corotating with the Earth have the form 

( 27 ™?ms) 3/2 

Here, p x is the local WIMP density, v rms is the one- 
dimensional velocity dispersion of particles, and i>i ag is 
the relative speed of the center of the Maxwell-Boltzmann 
distribution with respect to the experiments. The astro- 
physical reason for choosing this model for the WIMP 
velocities is described in Sec. IIII Al I choose to use distri- 
butions that are isotropic in the rest frame of the WIMPs 
for simplicity, although, in general, anisotropic velocity 
distributions are expected [48j . In principle, one can in- 
put an arbitrary speed distribution to an analysis of the 
type done in Sec. IIIII but that is beyond the scope of 
this work. 

For this work, I do not cut off the DF above an es- 
cape velocity v esc from the Galaxy, although this is an 
easy thing to add. The key points of this work hold re- 
gardless of the inclusion or exclusion of v csc in the DF. 
Moreover, there may be WIMPs passing through the ex- 
periments that lie above the escape speed, as the Milky 
Way is certainly not in dynamical equilibrium [63T - [65j . I 
also neglect the effect of gravitational focusing due to the 
gravitational potential wells of the Earth and Sun. How- 
ever, gravitational focusing is most relevant for WIMPs 
with speeds v < 100 km s^ 1 , which, as I show in Sec. 
IIII C[ are not generally accessible to the types of experi- 
ments described in Sec. Ill CI 

I define the "standard halo model" (SHM) as a single 
Maxwell-Boltzmann distribution with vi ag — 220 km s~\ 
which is the IAU value for the speed of the local standard 
of rest (LSR) [Hj]. This value is ~ 10% lower than that 
inferred from recent astrometric measurements of masers 
in star- forming regions in the Milky Way [671-69] . The 
rms speed for the SHM is taken to be Wrms^wiag/v^- 
The factor of y/2 arises from the Jeans equation if one 
approximates the density profile of the galactic halo as 
p(r) cx r~ 2 and the rotation curve as flat (see Ref. [7(ij . 
Appendix A of Ref. [zj|, and Sec. IIII A|) . 

Simulations of disk galaxies in dark-matter halos show 
that massive satellites are preferentially dragged into the 
disk plane, where they subsequently disrupt due to tidal 
forces [H(| 152l - l54l |72| . The disrupted dark matter settles 
into flarcd-disk-likc structure coincident on the baryonic 
disk, thus forming a "dark disk." Disk galaxies are gener- 
ically expected to have dark disks, although the proper- 
ties of the disk depend strongly on the accretion history 
of the host halo. Thus, we expect that the local WIMP 
DF should have a dark-disk component. Using Ref. [52[ 
as a guide, I define a "standard dark disk" (SDD) ve- 
locity distribution as having the form of Eq. © with 
vi ag = 100 km s^ 1 and v rms = 50 km s _1 . The weight of 
the SDD with respect to the halo models will be described 
in Sec. IIII Bl in which I consider multimodal speed dis- 
tributions. 
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C. Toy experiments 

I simulate data sets for four idealized 2015-era exper- 
iments. The first two experiments use liquid xenon as 
their target material, inspired by the planned XENON1T 
and under-construction LUX experiments [H, [zH • The 
third toy experiment is based on the several ton-scale 
liquid-argon experiments planned and under construc- 
tion (e.g., the various experiments in the DEAP/CLEAN 
program (32[, ArDM [Hj]). The last toy experiment is 
based on the under-construction SuperCDMS cryogenic 
germanium experiment [37] ■ I assume a total xenon mass 
of 1 ton and an exposure of 1 year for the "XENON1T- 
like" experiment, 350 kg of xenon and an exposure of 1 
year for the "LUX-like" experiment, 1 ton of argon and 
an exposure time of 1 year for the "argon" experiment, 
and 100 kg of germanium and a 1-year exposure for the 
"SuperCDMS-like" experiment. 

Note that I do not consider constraints on the speed 
distribution and WIMP particle-physics parameters for 
each experiment individually. As I [43| and others [H, 
[58j have shown, unless one fixes either the WIMP mass 
or the speed distribution of WIMPs a priori, one does 
not obtain meaningful parameter estimates from a single 
experiment. This is because one needs to have a handle 
on what sets the energy scale for the recoils: the WIMP 
mass or the WIMP speeds, since the recoil energy is given 
by 



Q = -^. v 2 (l-cos( 



(7) 



where 9 is the center-of-mass scattering angle. The true 
power comes in having a variety of experiments with dif- 
ferent target nuclei, which allows one to break the de- 
generacy between WIMP mass and WIMP speeds in the 
recoil energy spectrum. Moreover, many experiments do 
and will continue to run simultaneously, and there is no 
reason not to consider the combined constraints from all 
experiments. 

These toy experiments are idealized in that I assume 
that backgrounds are negligible, and that they have per- 
fect energy resolution and no systematic errors. The 
reasons for choosing such idealized scenarios are the 
following. First, the actual background rates and en- 
ergy resolution for the 2015-era experiments are un- 
known, although the goal of most experiments is to get 
to the zero-background regime. Energy errors for the 
current germanium-based experiments are negligible for 
parameter-estimation purposes [43j], but are potentially 
a major issue for liquid-noble experiments. For example, 
there is currently a large systematic error on the inferred 
nuclear recoil energies based on the scintillation light ob- 
served in xenon-based experiments 0, [z3~[z3] • Exper- 
iments are underway to better characterize the relation 
between the energy seen in experiments and the nuclear 
recoil spectrum, so it is likely that the energy resolution, 
systematics, and background sources will be far better 
characterized in the future than they are now. Second, 



by using idealized experiments, I show the minimum ex- 
pected uncertainty in the WIMP parameters. Any back- 
grounds and energy errors are likely to increase the ex- 
pected uncertainty in those parameters. If the methods 
I used in Sec. IIIII had failed for even ideal set of experi- 
ments, they would have certainly failed on the real deal. 

There are two key features of current experiments that 
I keep. First, I approximate the experimental efficiency 
£(Q) for each type of experiment to resemble those of 
current or recent experiments. This efficiency is the prob- 
ability that if there is a nuclear recoil of energy Q some- 
where in the experimental volume, it survives the selec- 
tion cuts into the analysis. The efficiency £ (Q) includes 
both a fiducial volume cut as well as the acceptance prob- 
ability within the fiducial volume. I use the same efficien- 
cies as used in Ref. (43|. Second, I retain the analysis 
windows (i.e., the nuclear recoil search window from the 
threshold energy Q mm to the maximum considered en- 
ergy Qmax) of current experiments, because as I show 
below in Sec. IIIII the analysis window strongly affects 
parameter estimation. (Q m i n , Q ma , x ) is (2 keV, 30 keV) 
for the XENONlT-like experiment, (5 keV, 30 keV) for 
the LUX-like, (30 keV, 130 keV) for the argon experi- 
ment, and (10 keV, 100 keV) for the SuperCDMS-like 
experiment. 



D. Parameter estimation 

Once I simulate mock data sets, I assess the parameter 
constraints using an unbinned likelihood function. The 
probability that a single recoil is observed with energy Q 
and with theoretical parameters {9} and with experimen- 
tal parameters (target nucleus, Q m i n , Q max , etc.) {7} is 




Pi(Q\{e},{i}) = 



g(Q,{7}W<*Q({g},{7» 
dQ>£(Q>, {j})dR/dQ<({9}, {7}) 



,(8) 



such that the likelihood of getting N l e events of energy 
{Qi, Q21 •••) Qj} m each experiment i is 



JV 



£({Q}IW) = II 



{Nl) 



N' 



- N' 



N' 



l[Pi(Q)\{9},{j*}), (9) 



where N is the number of experiments and N l D is the 
number of events observed in experiment i. This form of 
the likelihood is currently used by both the CDMS and 
XENON100 experimental groups [781179]. 

I use a Bayesian framework in which to determine the 
parameter uncertainties. In this framework, the prob- 
ability of the theoretical parameters of a given model 
hypothesis and the data, is 



P({9}\{Q}) cx £({Q}\{9})P({9}), 



(10) 



which is also known as the posterior. The coefficient 
relating the two sides of Eq. (fTU)) is irrelevant for pa- 
rameter estimation, so I replace "oc" with "=" in that 
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equation. P({9}) is the prior on the parameters. I use 
the publicly-available MultiNest nested sampling code 
to sample the posterior and determine parameter uncer- 
tainties [HI El]]. For the results in Sees. ITTTAl and ITTTbI 
I used 11000 live points for MultiNest, and 16000 live 
points for the results in Sec. IIII CI For all the results 
discussed in Sec. IIII| I used a sampling efficiency of 
ef r= 0.3 and a tolerance on the accuracy of Bayesian 
evidence of tol= 10~ 4 . The values of ef r and tol were 
chosen to get a good estimate of the maximum likeli- 
hood £ max and the Bayesian evidence Z. The latter is 
the integral of the posterior over the volume of theoretical 
parameters 

Z(H\{Q}) = Jd{9}P({9},{Q}). (11) 

Here, H is the hypothesis for the model For ex- 

ample, a model hypothesis would be that all recoils in 
the direct-detection experiments are due to elastic scat- 
ters between WIMPs and nuclei and that the WIMP dis- 
tribution function is described by a Maxwell-Boltzmann 
distribution. Both £ max and Z need to be calculated 
for Bayesian model-selection criteria, which I discuss in 
greater detail in Sec. IIII CI 

In addition, the low values of ef r and tol allow one 
to estimate of the profile likelihood [HJ , which is defined 
as 

C P ({Q}\6i) = max (£({Q}\9 U {9})) , (12) 

i.e., the maximum likelihood for a subset of the theo- 
retical parameters fixed, over the space of the remaining 
theoretical parameters [83[ . The profile likelihood is use- 
ful to calculate in addition to the marginalized posteriors 
to get a sense of whether the confidence limits based on 
the posterior are due to the size of the parameter space or 
due to high values of the likelihood. See Refs. [83l,T84l. l86j 
for more discussion. In the following sections, I show con- 
fidence limits based on the marginalized posteriors and 
not the profile likelihood, using the latter as a sanity 
check. 

The WIMP mass was sampled logarithmically in the 
interval 1 MeV < m x < 100 TeV, and the WIMP 
cross-section parameter D = Px&p 1 l m \ was sam- 
pled logarithmically from 10 -60 GeV~ 1 cm~ 1 < D < 
10~ 40 GcV _1 cm _1 . The speed-distribution parameters 
were sampled linearly, as described in Sec. Mil 

It took MultiNest approximately 4 CPU-hr to con- 
verge for each ensemble of mock data sets in Sees. IIII Al 
and IIII B1 on a single processor on the University of Cal- 
ifornia, Irvine's Greenplanet cluster, and from 18 to 150 
CPU-hr for each ensemble in Sec. IIII CI depending on 
the dimensionality of the parameter space and the size 
of the data sets. I found that the code slowed down dra- 
matically if the number of parameters in the hypothesis 
exceeded ~ 10. 



III. RESULTS 



In this section, I apply the analysis techniques in 
Sec. Ill Dl to mock data sets for several points in WIMP 
particle-physics and speed-distribution parameter space. 

In Sec. IIII Al I estimate how well one may estimate 
flag and v Tms for single-Maxwell-Boltzmann distribution 
benchmark speed distributions of the form © with a 
single-Maxwell-Boltzmann hypothesis. Most forecasting 
studies have focused on a single benchmark speed distri- 
bution, but I show how the uncertainty on the WIMP 
mass, elastic scattering cross section, v\ ag and w rms de- 
pends sensitively on the underlying values of v\ ag and 

^rms ■ 

In Sec. IIIIBI I consider the case that the speed 
distribution is multimodal, but analyze the mock data 
sets with the hypothesis that the velocity distribution is 
Maxwell-Boltzmann. The goal is to determine how bi- 
ased the inferred WIMP mass and cross sections might 
be. 

In Sec. IIII CI I analyze mock data with the SHM 
and Sec. IIIIBI multimodal benchmark speed distribu- 
tions with the hypothesis that the speed distribution is a 
set of five step functions in geocentric speed. This model 
of the speed distribution is supposed to be representa- 
tive of a class of empirical models that may be used to 
fit the data. While it is almost certainly not the opti- 
mal empirical hypothesis, it allows me to explore how 
well one may recover the WIMP mass, cross section, and 
speed distribution without a fixed, theoretically-inspired 
form for the DF. In addition, I show that even for fairly 
small data sets, one may use Bayesian model-selection 
techniques to determine the relative quality of the fits 
for different hypotheses for the speed distribution. 

In each section, I only consider one value of the param- 
eter D = p x a^ /m 2 , setting D = 3x 10~ 45 GeV^cm- 1 . 
I consider this parameter instead of treating cr^ and p x 
independently because of the total degeneracy of these 
parameters in direct-detection signals. Only with outside 
information on a^ 1 (e.g., from future collider data sets) 
or p x may one place limits directly on the other param- 
eter. If one assumes p x = 0.3 GcV cm~ 3 [87, 88], then 
the fiducial value of D implies a^ 1 10~ 44 cm 2 , which 
is a factor of several below the minimum of the current 
m x — dp 1 exclusion curve. This value of D should be ac- 
cessible to next-generation direct-detection experiments. 
Note, though, that the exclusion curve is constructed 
by fixing the WIMP speed distribution to a particular 
model. 

In both Sees. IIII Al and IIIIBI there are four free pa- 
rameters to fit: m x , D, flag, and t> rms . In Sec. IIII CI the 
number of free parameters is two plus the number of step 
functions used to describe the speed distribution. 
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A. Single Maxwell-Boltzmann Distribution, In and 

Out 

The first test is to see how well one may infer WIMP 
particle-physics and speed-distribution parameters in the 
case that the hypothesis for the form of the speed distri- 
bution matches the form of the true distribution. In par- 
ticular, I focus on parameter constraints for the bench- 
mark SHM and variations to it, making the most min- 
imal of prior assumptions about any of the parameters 
of the WIMP and Maxwell-Boltzmann model hypothe- 
sis: {9} = {m x , -D, «iag, Wrms}- While previous forecast- 
ing studies have considered a variety of benchmark m x , 
nearly all (with the exception of Refs. [H|,|44]]) have con- 
sidered only one fiducial speed distribution with fixed 
i>ia g and v Tms . However, even with the ansatz that the 
local WIMP density is dominated by a smooth, equilib- 
rium halo component (neglecting the accretion-history- 
dependent features seen in high-resolution N-body sim- 
ulations and any anisotropy in the velocity ellipsoid 
[48l |49} ) with one of the theoretically- inspired forms of 
the speed distribution, there is still a great deal of uncer- 
tainty on the appropriate values of v\ ag and i> rms for the 
Milky Way. 

With the ansatz that the local WIMP DF results from 
a smooth, equilibrium, nonrotating dark- matter halo DF, 
the appropriate choice for vu g is the sum of the velocity of 
the LSR [70j , solar motion (the peculiar speed of the Sun 
relative to the LSR) [89[, and the velocity of the Earth 
about the Sun. The largest uncertainty on any of those 
components is on the LSR. While the IAU standard is 
^ls r = 220 km s _1 with approximately 10% uncertainty 
[66} , more recent measurements of the rotation curve and 
of the mass of the Milky Way halo indicate that slightly 
larger values are preferred [63, However, the uncer- 
tainty in the speed of the LSR from any measurement 
in the past several decades has not changed (see, e.g., 
Ref. [9l[), so the range of plausibility for the speed of 
the LSR is 200 - 270 km s _1 . With the addition of solar 
motion and the velocity of the Earth about the Sun, in 
this work I consider the range of plausibility for «i ag to 
be 220- 280 km s _1 . 

It is not clear what the best choice for w rms is. For a 
power-law dark-matter density profile p(r) cx r _/3 and a 
flat rotation curve, it can be shown that the distribution 
function is Maxwell-Boltzmann velocity dispersion 



= "lag/ \fP 



(13) 



if one assumes that the velocity ellipsoid is isotropic [71| . 
If dark-matter profiles are described by a Navarro-Frenk- 
White density profile with a scale radius r s , then p(r) cx 
r _1 for r <C r s , p(r) cx r~ 2 for r ~ r s , and p(r) cx r~ 3 
[92l [93l ] . Neglecting the effects of baryons on dark-matter 
halos, for a dark-matter halo of mass (1 — 3) x 10 12 M Q 
(the plausible range of values for the Milky Way's virial 
mass [67], 0) 01)) the typical scale radius should be of 
order 10 — 30 kpc [95} . Given that the Sun sits ~ 8 kpc 
from the Galactic center [96], it is plausible that j3 ~ 1—2. 



The first step in this analysis is to see how the con- 
straints on m x and D are affected by the underlying 
WIMP speed distribution. I consider three benchmark 
WIMP masses: m x = 50,100, and 500 GeV. I bracket 
the range of plausible v\ ag and v Tms with the following 
benchmark Maxwell-Boltzmann DFs: the SHM; vi ag — 
220 kms _1 and v Tms = 220 km s" 1 ; v Ug = 280 km s" 1 
and i>rms= 200 km s — ; and t>i ag = 280 km s _1 and u rms = 
280 km s . The mock data sets had of order 100 events 
for the LUX-like experiment, of order tens to a hundred 
events for the argon experiment, of order ten or tens 
for the SuperCDMS-like experiment, and several tens to 
hundreds of events for the XENONlT-like experiment. 
The latter has a relatively high number of events due to 
the low energy threshold Q mm = 2 keV. The total number 
of events in all toy experiments decreased with increas- 
ing WIMP mass due to the fact that the number den- 
sity of WIMPs n x cx m" 1 and that the typical WIMP 
speeds were high enough that there were many events 
above threshold for all experiments. 

As described in Sec. Ill D[ I sampled D and m x loga- 
rithmically for the MultiNest nested sampler. I sam- 
pled vi a g and v rms linearly in the range — 2000 km s~ . 
Even though this range is far broader than the "plausi- 
ble" ranges for these parameters, I want to explore pa- 
rameter constraints with weak priors. If the data are suf- 
ficiently good, the parameter constraints should depend 
little on the prior. Since my choice of D is somewhat 
optimistic, if the parameter constraints are prior depen- 
dent for even this value of D, then parameter inference 
for 2015-era direct-detection experiments will be heavily 
prior dependent. The upper end of the range for i>i ag and 
v TXns is far above the current best estimates for the local 
escape speed from the Galaxy, v c 



550 km s" 1 [97 1 



The reconstructed m x and D are shown with the light- 
color-filled contours in Fig. [TJ panels of which were made 
using a modified version of the publicly-available COS- 
MOMC getdist code [H|]. Each column in the figure 
shows the results for a single WIMP mass, and each 
row shows a different speed-distribution benchmark. The 
68% and 95% confidence limits (C.L.) are actually cen- 
tral credible intervals, for which equal volumes of the 
posterior lie outside the upper and lower edges of the in- 
tervals 85]. This is how the C.L.'s will be defined for the 
rest of this work. Generically, it is possible to get good 
constraints on low-mass WIMPs even without strong pri- 
ors on the speed-distribution parameters t>i ag and t> rms , 
although the constraints for m x = 50 GeV are much 
tighter for the SHM that the other equilibrium halo mod- 
els. However, the constraints if m x = 50 GeV are poor 
if vi ag =v TIns = 280 km s _1 . 

The constraints for m x > 100 GeV are much poorer 
than for m x < 50 GeV. In general, it is only possi- 
ble to find a lower limit for the WIMP mass and cross 
section. This is because the typical recoil energy is 
Q ~ pAV\ ag /mA, where p. a is the reduced mass for the 
WIMP-nucleon system. For m x /mA 3> 1, pa ~^ ttia- 
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FIG. 1: Marginalized probability distributions for m x and D = p x a% 1 /trip. The 68% C.L. region is darker than the 95% C.L. 
region. The lighter pair of contours is associated with WIMP searches in the analysis windows described in Sec. Ill CI and the 
darker pair is associated with extending the analysis window to 1 MeV. Each row of figures corresponds to a different WIMP 
speed-distribution benchmark model. For each, a single Maxwell-Boltzmann velocity distribution is assumed (of the form in 
Eq. (©), but with different ui ag and v Tms . The x's mark the input m x and a^assuming p x = 0.3 GeV cm -3 . 
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FIG. 2: Recoil energy spectra for a xenon-based experiment for different m x , ui ag , and v IYns . Each column represents a different 
m x , and each row represents a different t>i ag . The lines on the plots represent different i> r ms, with (thinnest to thickest) 
Urms = 50, 100, 150, 200, 250 km s _1 . The shaded region shows the XENON10 analysis window [||. 



Thus, the recoil spectrum is independent of WIMP mass 
for sufficiently high-mass WIMPs. This point is illus- 
trated in Fig. [2] in which I plot recoil spectra of xenon 
as a function of m x , i>i ag , and v rms . Each column repre- 
sents a different WIMP mass, each row a different vi as , 
and the line thickness signifies the value of v rms . The 
shaded region is the analysis window for the XENON10 
experiment, the analysis window I use as the default for 
the LUX- like toy experiment [98j . 

The shapes of the recoil spectra in and outside of the 
analysis window in Fig. [5] indicate a possible way to 
more tightly constrain the WIMP parameters: extend 
the analysis windows to higher energy. A larger analysis 
window gives one a longer lever arm on the recoil spec- 
trum. In Fig. [5J there are a number of recoil spectra 
that look nearly identical inside the analysis window but 
diverge outside. Even a few recorded events at high re- 
coil energy could prove useful in parameter constraints. 
The darker set of contours in Fig. [1] indicate parameter 
constraints when the upper end of the analysis window is 
extended to Q max = 1 MeV for all experiments. There is 
only a modest increase in the number of events relative 
to the number of events in the fiducial analysis windows 
(~ 5% — 25% depending on the WIMP mass, target nu- 
cleus, and speed distribution) , but the constraints in the 
m x —D plane are obviously significantly better, especially 



for the m x = 50 GeV and 100 GeV cases. The question is 
if backgrounds at higher energies will limit the constrain- 
ing power of the high-energy nuclear recoils. I defer that 
subject to future work. 

Next, I examine the constraints on the WIMP speed- 
distribution parameters v\ ag and f rms , which are shown 
in Fig. [3] As in Fig. [1] the lighter set of marginalized 
probability contours corresponds to the fiducial analysis 
windows, and the darker set of contours corresponds to 
the analysis in which Q max is increased to 1 MeV. As in 
the m x —D plane, the constraints are tighter for higher 
Qmax- The speed-distribution constraints are generally 
better for low-mass WIMPs than for m x = 500 GeV. For 
the lower-mass WIMPs, there is a long tail in the pos- 
terior towards small v\ ag . This has to do with the fact 
that although the typical WIMP speed ui ag is important 
in setting the typical energy scale of the events, the dis- 
tribution of speeds v rms governs the shape of the recoil 
spectrum. For example, if the WIMP distribution func- 
tion were a delta function centered on Ki ag (the limit of in- 
finitely small n rms ), the recoil spectrum divided through 
by Fgj(Q) would be a step function that cuts off when 
Vmin exceeds Ui ag . If, however, the distribution function 
were flat up to some cut-off such that the typical speed 
were ui ag (the limit of large u rms ), there would be a longer 
tail in the recoil spectrum to higher Q since this distri- 
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FIG. 3: Marginalized probability distributions for ui ag and « rms assuming a single Maxwellian speed distribution. Contours 
show 68% and 95% C.L regions. The lighter pair of contours is associated with WIMP searches in the analysis windows 
described in Sec. Ill CI and the darker pair are associated with extending the analysis window to 1 MeV. Each row of figures 



corresponds to a different benchmark WIMP velocity model with (top to bottom): ui ag = 220 km s 1 



vi^ g — 220 km s 1 



9 = 220 km s" 1 ; «i ag = 280 km s _1 



3 = 200 km s _1 ; and tw= 280 km s _1 



155 km s 1 ; 
3 = 280 km s" 1 . 



Each column represents a different benchmark WIMP mass. 
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bution would have a number of high-speed WIMPs. 

There are already two conclusions we can draw from 
this study. First, one may simultaneously constrain the 
parameters of the model (m x , D, V\ as , and u rms ) from 
direct-detection data without strong priors on any of 
those parameters, assuming that the true WIMP DF 
looks something like a Maxwell-Boltzmann distribution 
and there are at least of order 100 events in all experi- 
ments combined. The constraints do vary as a function of 
all those parameters, and appear best for the SHM ver- 
sus halo models with higher «i ag and v Tlas . Second, the 
constraints improve significantly if the analysis window 
is extended to higher energies, at least if backgrounds are 
negligible. 

It is useful to see how the constraints on m x and D 
compare to the case in which strong priors are placed on 
the speed distribution, as is typically done in WIMP pa- 
rameter forecasts [4(| |45| . I consider a prior that consists 
of a Gaussian for v\ &g centered at 220 km s" 1 and with 
a width of 22 km s _1 multiplied by a Gaussian prior on 
P centered on V2 with a width of 0.4. The width of the 
prior on v\ ag is the IAU value of the uncertainty on the 
speed of the LSR, and the prior on /3 spans the values 
expected for a Navarro-Frenk- White profile. In Fig. [H I 
show the constraints in the m x —D plane using Q max = 1 
MeV with this prior (light-colored filled contours) and the 
constraints without the prior with the same Q max . The 
constraints on m x and D are not significantly better with 
the inclusion of the strong prior — the data are sufficient 
for the likelihood to influence the posterior away from 
the prior, although not entirely. This also means that the 
strong prior does not significantly bias the constraints on 
m x and D. The only case in which the prior does some- 
what improve the fit is for the SHM, which is unsurpris- 
ing because the priors are centered on SHM parameters. 
The takeaway message from Fig. [4] is that imposing a 
strong prior on the speed-distribution parameters is un- 
necessary, at least if there are at least of order 100 events 
in all experiments combined. If there are fewer events, 
the parameter constraints may be prior dominated if a 
strong prior is imposed. 

Alternatively, one may view the particle-physics pa- 
rameters m x and D as being nuisance parameters if the 
goal is to determine the WIMP speed distribution as 
well as possible from the direct-detection data. In Fig. 
[5J I consider the marginalized probabilities of t>i ag and 
j) rms when imposing a Gaussian prior on m x centered 
on the true value, with the width on the prior set to 
0.1to x . This is the range of uncertainty on the WIMP 
mass one might achieve if supersymmetry is discov- 
ered at the Large Hadron Collider (LHC) [99|. As in 
Fig. [3l the lighter-filled pair of contours corresponds 
to the fiducial analysis windows, and the darker-filled 
contours correspond to setting Qmax= 1 MeV. In gen- 
eral, the mass prior sharpens the v rms probability dis- 
tribution but only alters the constraints in the v\ ag - 
direction a little. This is somewhat disappointing be- 
cause it means that we will likely only obtain an upper 



limit on ui ag with 2015-era direct-detection experiments. 
The mass prior most strongly affects the probability dis- 
tribution of the speed parameters for high-mass WIMPs 
because it down weights the speed parameters preferred 
by the low-m x tail in the posteriors in Fig. [3] 

I have also checked the parameter constraints in the 
case that the speed distribution deviates significantly 
from the SHM. For either a SDD or a high-«i ag , small- 
ii rms velocity stream, sampling v rTas and «i ag linearly in the 
region — 2000 km s _1 in MultiNest leads to good con- 
straints on both the WIMP particle-physics parameters 
and on wi ag and w rms . The only case for which constraints 
are poor (at least for the fiducial D) occurs when the 
typical recoil energy Q = [^A v \a,J m A ^ es near or below 
the energy threshold Q max for the lower-threshold exper- 
iments. This constraint improves with larger D, though. 
Moreover, I have examined the profile likelihoods in ad- 
dition to the marginalized posteriors, and find the shape 
of the profile likelihoods and the marginalized posteriors 
to be broadly consistent regardless of the actual values 
of ra x , uiag, and 

B. Multimodal distribution in, Maxwell-Boltzmann 
distribution out 

So far, I have only considered the case in which the 
hypothesis for the form of speed distribution matches its 
actual form. However, there are strong reasons to believe 
that the DF could be multimodal. High-resolution dark- 
matter-only simulations show that there are spatially- 
varying (on ~ kpc scales) bumps and wiggles in the 
WIMP velocity distribution, imprints of the halo's accre- 
tion history and the tidal stripping of subhalos [48|, |49j . 
Simulations of Milky Way-mass dark-matter halos that 
include baryons show that there exists an additional 
macrostructure, a dark disk formed through the drag- 
ging and disruption of satellites in the disk plane of the 
galaxy [HO, [z3l ■ Moreover, the Milky Way is still accret- 
ing more small halos, which can disrupt and form tidal 
streams on small scales that have not yet phase mixed. 
The key point of this section is to determine how badly 
m x and D are biased if one makes the ansatz of a sin- 
gle Maxwell-Boltzmann distribution function even if the 
distribution function is multimodal. 

I examine two multimodal velocity distributions. First, 
I consider a model in which half of local WIMPs are de- 
scribed by the SHM and half by the SDD. I keep D fixed 
to 3 x 10 -45 GeV cm -1 , so that the combination of p x 
and (jp 1 remain the same as in Sec. IIII Al This model 
will be called the "SHM + SDD" model. Second, I con- 
sider a model in which half the local WIMPs have a SHM 
distribution function, 30% have a SDD distribution func- 
tion, 10% have a Maxwell-Boltzmann distribution with 
l 'iag= 400 km s^ 1 and v Tms = 50 km s _1 , and 10% have a 
Maxwell-Boltzmann distribution with i | i a g= 500 km s _1 
and v lr[LS = 50 km s~ . These latter two distributions are 
supposed to represent tidal streams. This model will be 
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FIG. 4: Marginalized probability distributions for m x and D = p x ap ! /rrip with Maxwellian speed distributions for the WIMPs 
as the benchmark models and as the hypothesis. The lines outline 68% and 95% C.L. regions. The darker pair of contours is 
associated with WIMP searches with flat priors on the velocity parameters with Q ma x = 1 MeV for all experiments, and the 
lighter pair are associated with a 10% Gaussian prior on v\ ag = 220 km s" 1 and a prior on j3 (Eq. 1131 see text for details). 
Each row of figures corresponds to a different benchmark WIMP velocity distribution, and each column represents a different 
benchmark WIMP mass. 
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FIG. 5: Marginalized probability distributions for i>i ag and v rlnB assuming single Maxwellian speed distributions as both the 
benchmark models and the hypothesis, and assuming a Gaussian prior on the WIMP mass centered on the benchmark values 
with the width of the Gaussian equal to 0.1m x . The lighter pair of contours is associated with WIMP searches in the analysis 
windows described in Sec. Ill CI and the darker pair are associated with extending the analysis window to 1 MeV. The regions 
denote 68% and 95% C.L. regions. Each row of figures corresponds to a different WIMP velocity model with (top to bottom): 
v\ ag — 220 km s _1 & tii ms = 155 km s _1 ; v\ m = 220 km s _1 & w rm! = 220 km s _1 ; vi ag = 280 km s _1 & i> r ms= 200 km s _1 ; and 
v\^ g — 280 km s" 1 & n ms = 280 km s _1 . Each column represents a different benchmark WIMP mass. 
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called "SHM + SDD + 2 streams." 

I create mock data sets for each of these models for 
m x = 50,100, and 500 GeV, and analyze the data sets 
with the hypothesis of a single Maxwell-Boltzmann DF. 
As in Sec. MI A[ I sample m x and D logarithmically, and 
flag and v TmB linearly. The two-dimensional marginalized 
probability distributions for m x and D are shown in Fig. 
11 For the SHM + SDD model, the center of the m x -D 
probability distribution is offset from the true values by 
~ 50% for m x = 50 and 100 GeV, for either the fiducial 
Qmax or for Q max = 1 MeV. The direct-detection experi- 
ments are generally not overly constraining in the m x —D 
plane for m x /m j 4 3> 1, regardless of the true speed dis- 
tribution. 

The lower panels in Fig. [6] show the constraints for the 
SHM + SDD + 2 streams input distribution function. 
The centers of the probability contours arc offset for m x — 
50 and 100 GeV, although not as much as for the SHM 
+ SDD model. This is because events from the high- 
velocity streams populate the high-Q end of the recoil 
spectrum, which balances out the low-Q dominance of 
the SDD. 

For the two examples explored in this section, using 
the ansatz of a single, smooth distribution function even 
if the actual distribution function is multimodal leads 
to ~ 50% biases in m x and D. Although these biases 
are not as disastrous as they could have been, there are 
reasons to disfavor using the single-mode distribution- 
function ansatz for what in reality is likely to be a mul- 
timodal distribution function. First, one would really 
like to obtain unbiased estimates for the particle-physics 
parameters for purposes of accurate dark-matter identi- 
fication. Additionally, one loses information about the 
actual speed distribution by forcing a particular form on 
the data. This would be a shame, since direct-detection 
experiments and neutrino-telescope observations of the 
Sun and Earth provide the only way to probe the speed 
distribution. This is likely the only way we will ever know 
if the Milky Way has a dark disk, or if there is a signif- 
icant amount of microstructure in the Galactic WIMP 
distribution function. The next task is to determine if 
we can find an empirical hypothesis for the local WIMP 
speed distribution that fits the data better. 



C. Empirical speed distributions and hypothesis 
testing 

The two goals of this section are to get a sense of 
how effective empirical speed-distribution models are at 
recovering the WIMP speed distribution, m x , and D; 
and to determine if one can tell empirically if the DF 
is not well described by a smooth halo model. When 
one does not have an overwhelmingly well-supported the- 
oretical hypothesis, as is the case for the local WIMP 
DF, it is good to adopt simple, more empirical hypothe- 
ses. This is the approach recommended by the Joint 
Dark Energy Mission Figure of Merit Science Working 



Group — instead of forecasting constraints a particular 
quintessence-inspired equation-of-state evolution func- 
tion for the dark energy (whose nature is perhaps even 
less constrained than dark matter), they recommend con- 
straining the equation of states in redshift bins [l00| . 

I adopt a similar approach for the WIMP speed distri- 
bution. In particular, I use a step-function model for the 
speed distribution in the parameter search. I focus on 
constraining the coefficients gi for a step-function form 
of the speed distribution, 

9(v) = Y^gi&{v-Vi)Q(vi +1 -v), (14) 
»=i 

where vi is the lower limit of the speed for the zth g(v) 
bin, Uj+i is the upper limit. is the Heaviside step func- 
tion. The hat symbol denotes the fact that this speed 
distribution is estimated from the data regardless of the 
true g(v). In the limit of an infinite number of bins 
N g — > oo, g(v) — > g(v). In this work, I choose bins of 
equal size in v. Either the step- function model or the 
choice of binning may be far from the optimal empiri- 
cal parametrization of the speed distribution, but these 
choices for the speed-distribution analysis serve the pur- 
pose of providing a good proof of principle for WIMP 
speed-distribution recovery and model comparison. 

For this work, I first consider five bins in speed up to 
v = 1000 km s _1 . This upper limit is somewhat larger 
than the estimated escape speed from the Milky Way in 
a geocentric frame [9^ |. By setting the maximum speed 
for the speed-distribution bins, I am placing a strong 
prior that the maximum WIMP speed must lie below 
that value. As in the previous sections, I choose the 
usual benchmarks for WIMP mass, m x = 50, 100, and 
500 GeV, and fix D = 3 x 10~ 45 GeV _1 cm _1 . As before, 
I sampled those parameters logarithmically using Multi- 
Nest. I chose three different benchmarks for the speed 
distributions for the mock data sets: the SHM, the SHM 
+ SDD, and the SHM + SDD + two high-speed velocity 
streams (with the same weighting of components as used 
in Sec. IIIIBI) . I sampled the five velocity-bin coefficients 
{gi} linearly in the range from to {<?f lax }, where g™ ax 
is the maximum value of gi if all other gj^i = and 
satisfying the normalization condition in Eq. (j4]). While 
the marginalized 68% and 95% confidence-level regions in 
the speed coefficients are not dramatically different if one 
samples the {gi} logarithmically, the marginalized con- 
tours generally follow the shape of the profile likelihood 
better for linear scans in {gi}. 

The first benchmark speed distribution I consider is 
the SHM. The constraints in the m x —D plane are shown 
in Fig. [Jj and the constraints on {gi} are shown in Fig. 
[5] In Fig. [7J I show the marginalized probabilities for 
the fiducial Q max with the set of light-colored filled re- 
gions, and the marginalized probabilities for Q niax = 1 
MeV with the darker pair of regions. The first thing 
to note is that the parameter uncertainties are no larger 
than those found in Sec. IIII Al although they are biased 
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FIG. 6: Marginalized probability distributions for m x and D. The 68% C.L. contours are darker than the 95% C.L. contours. 
The lighter pair of contours is associated with WIMP searches in the analysis windows described in Sec. Ill CI and the darker pair 
are associated with extending the analysis window to 1 MeV. The top panels represent the SHM + SDD benchmark WIMP 
speed distribution, and the bottom panels represent the SHM + SDD + 2 streams benchmark WIMP speed distribution. 
Parameters inferred using the hypothesis of a single Maxwell-Boltzmann population of WIMPs. See text for details. 



in the cases of m x = 50 and 100 GeV. The bias decreases 
with increasing Q max , though. The second thing to note 
is that the shape of the degeneracy contours is quite dif- 
ferent than with the Maxwell-Boltzmann ansatz used in 
Sec. IIII Al This is because the shape of the mapping be- 
tween m x and D and a fixed recoil spectrum depends on 



the form of the speed distribution. Third, for m x 



50 



GeV there are disconnected regions. This is an artifact 
of the "realization noise" in the data. 

The reconstructed speed distributions are shown in 
Fig. [5J Each column in the figure represents a different 
WIMP mass. The error bars represent the marginalized 
68% probability limits for each gi. Note that the proba- 
bility contours are in fact correlated. The solid error bars 
denote the limits obtained with the fiducial <5 ma x, and 
the dotted error bars denote those obtained if Q m ax= 1 
MeV. In the upper panels, the WIMP mass is only con- 
strained to be somewhere between 1 MeV and 100 TeV, 
but in the bottom panels, I impose a Gaussian prior on 
the WIMP mass centered on the true value and with a 
width of 0.1w x . The solid line shows the SHM speed dis- 
tribution. In general, using the higher Q max leads to bet- 
ter fits to the SHM speed distribution, with the exception 
of the case in which m x = 50 GeV. I note that a similar 



trend towards a larger low-speed population is also seen 
in the Maxwell-Boltzmann analysis in Sec. IIII Al for this 
particular benchmark. Figures [3] and [5] show that the 
true speed-parameter point barely lies within the 95% 
C.L. contour. This high inferred density of low-speed 
particles is an artifact of this particular realization of the 
data for this set of benchmark parameters. 

Although the inferred speed distributions look reason- 
able, one might want to ask if the inferred speed distri- 
bution were consistent with Maxwell-Boltzmann distri- 
bution. The issue of model selection is tricky for both 
the frequentist and Bayesian perspectives if o ne ca nnot 
use x 2 to determine the goodness of fit (e.g., 82lll0il |). In 
general, the goal is to determine the relative fit between 
hypotheses instead of determining the absolute quality 
of fit for a single hypothesis. I use three different crite- 
ria to assess the relative quality of fit between the single 
Maxwell-Boltzmann and step- function speed-distribution 
hypotheses: t he B ayes factor, the Akaike information cri- 
terion (AIC) |l02j |. and the Bayes information criterion 
(BIC) [l04j . However, for reasons stated below, I will 
emphasize the Bayes factor in particular. 

In the Bayesian context, the ratio of Bayesian evi- 
dences [Eq. (fTTj) ] for two hypotheses ("Bayes factor") 
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FIG. 7: Marginalized probability distributions for m x and D with the SHM as the velocity benchmark and analyzed with 
the five-bin step- function speed-distribution hypothesis. Each panel represents a different benchmark m x . The lighter pair of 
contours represents 68% and 95% C.L. regions based on the analysis windows described in Sec. Ill CI and the darker pair of 
contours are the results if the analysis windows are extended to 1 MeV. 



is often used to determine if one hypothesis fits the 
data better than the other. Since the Bayesian evidence 
is just the average likelihood over the parameter space 
(weighted by the prior), the better-fit hypothesis is as- 
sumed to be the one with the higher average likelihood, 
regardless of maximum likelihood C max . This means 
that models with fewer parameters are generally pre- 
ferred (the "Occam's razor" hypothesis — simpler models 
are better). Technically, the Bayes factor is not strictly 
the ratio of evidences, but is the ratio of evidences multi- 
plied by the ratio of the priors on the hypotheses. Quan- 
tifying the belief in the hypoth eses i s something I will not 
get into in this work, but Ref. [I0l| provides an interest- 
ing introduction to the subject. For now, I will assume 
that the hypotheses are equally probable. 

In general, the evidence is prior and parameter de- 
pendent. In the present case, determining whether 
the step-function hypothesis fits better than a Maxwell- 
Boltzmann distribution, the fact that WIMPs cannot 
travel with infinite speed allows one to at least define 
a reasonable parameter volume. For the step-function 
speed model, the {gi} cannot exceed {gf 13 *}- This pro- 
vides a natural volume to use, and was the volume I 
used to for the parameter search. As with the parameter 
search, I use flat priors on {gi] to calculate the evidence. 
To calculate the evidence for the Maxwell-Boltzmann 
model, I use the same priors and parameter-space vol- 
ume as used in the parameter search in Sec. IIII Al The 
upper bound for v\ ag and w rms are well above the escape 
speed from the Galaxy. 

Although I use the Bayes factor 

ZjlMB) 
z ({9i}) 

to get a sense the relative fit of the single Maxwell- 



Boltzmann (1MB) and step-function ({gi}) models, more 
in-depth studies are necessary to determine if this is 
really the best fit criterion for direct-detection data. 
Moreover, even though the way in which I have de- 
fined the prior volume is reasonable, it may not be the 
best; vi a g and w rms are a completely different way of 
parametrizing a speed distribution than {gi}. However, 
as I show below, the Bayes factor seems to be a not un- 
reasonable criterion by which to classify fits. 

Second, I consider the AIC, which is approximated as 



AIC = -21n£„ 



2N, 



(16) 



where £ max is the maximum likelihood for the data given 
the hypothesis, and N p is the number of parameters of the 
hypothesis. The AIC is meant to minimize the Kullback- 
Leibler information entropy jl03l |. and so the hypothe- 
sis with the smallest AIC is preferred. As with most 
Bayesian model-selection criteria, the AIC penalizes the 
introduction of additional parameters, but not as much 
as the BIC, the third model-selection criterion I consider, 
which is defined as 



BIC 



-2 mA, 



N p ln(N ) 



(17) 



Here, N Q is the observed number of events. For the data 
sets I consider, there are between ~ 200 and ~ 700 total 
events, which gives ln(A ) ~ 6. In the limit that the 
posterior is a multivariate Gaussian and that the data 
are independent and identically distributed, the Bayesian 
evidence and BIC are equivalent in terms of describing 
the quality of the fit [82|] ■ 

Even though I consider all three Bayesian model- 
selection criteria below, I emphasize the Bayes factor be- 
cause it is easiest to interpret and most likely to select 
the better model. The AIC does not necessarily select the 
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TABLE I: Bayes' factor for benchmark speed distributions 
and WIMP masses 



FIG. 8: Inferred WIMP speed distributions for the SHM 
benchmark model. The solid error bars denote the 68% C.L. 
region for each <?i using the fiducial analysis windows, and 
the dashed error bars show the same but for Q ma x = 1 MeV. 
The upper panels show the speed constraints when the WIMP 
mass and cross section are sampled logarithmically, and the 
bottom panels show the speed constraints when there is an 
additional 10% Gaussian prior on the WIMP mass. The solid 
line denotes the benchmark speed distribution. 



corr ect m odel even if one had an infinite, unbiased data 
set jl05l |. The issue with the BIC is that the posteriors 
for the direct-detection data sets are clearly not well de- 
scribed by multivariate Gaussians, and so it is not clear 
how then to interpret the BIC. In cosmology, the Bayes 
factor is t he preferred Bayesian model-selection criterion 
106M108 1 . I show the Bayes factor for each benchmark 



model in Table [U 

For all the SHM data sets, the SHM is preferred 
over the five-bin step- function speed-distribution hypoth- 
esis by the AIC and the BIC. However, the preference 
is not especially strong according to the Bayes factor. 
The mock data sets with the highest B are those with 
m x = 50 GeV with the fiducial Q max and m x = 100 GeV 
with Q max = 1 MeV, for which ln(£>) w 3, which is al- 
most considered "moderate" evidence on the Jeffreys' 
scale in favor of the SHM [82]. In the cases in which 
m x = 500 GeV, there is weak evidence for the step- 
function model; it is, however, not especially significant 
since any \\nB\ < 3 is considered weak evidence. 

Next, I consider the multimodal distribution functions 
I explored in Sec. IIII Bl The constraints for the SHM + 
SDD benchmark model in the m x ~D plane are shown in 
Fig. [9j and the speed-distribution fits are shown in Fig. 
[TU1 As in Fig. the lighter pair of contours indicate 
the marginalized probabilities of the parameters for the 
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fiducial values of Q ma x, and the darker pair correspond 
to setting Q mllx — 1 MeV. The probability contours in 
the m x —D plane are offset from the true point, with 
the exception of the m x = 500 GeV cases. The offsets 
are somewhat less than if one were to apply the ansatz 
that the velocity distribution is Maxwcll-Boltzmann (Fig. 
0, but not much. The offsets are lower if one uses a 
higher Q max . The probability contours have a different 
shape than those resulting from the hypothesis that the 
velocities have a Maxwell-Boltzmann distribution. 

The inferred speed distribution is shown in Fig. I1Q[ 
in which the lines and error bars have the same meaning 
as in Fig. [H While the inferred speed distribution is 
reasonable for the speed bins above 200 km s~\ and al- 
though the fits appear better for the cases of high Q max , 
the first speed bin is always systematically low. The 
main reason for this is that the speeds below approxi- 
mately 100 km s _1 are actually quite poorly constrained 
by the data. This is because the typical recoil energy for 
a WIMP moving 100 km s^ 1 with respect to the experi- 
ment is 



Q-O.l-^-^keV. 
1 GeV rriA 



(18) 



Thus, most of the low-speed WIMPs will scatter below 
Qmin, especially for the argon experiment, and so the 
lowest-speed bin in Fig. [TU] actually reflects WIMPs in 
the speed range v = 100 — 200 km s _1 . 

This point is further illustrated in Fig. [TTJ in which I 
show the inferred speed distribution for a step-function 
speed-distribution hypothesis with ten speed bins. In this 
case, the true WIMP mass is 500 GeV and I used the fidu- 
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FIG. 9: Marginalized probability distributions for m x and D for the SHM + SDD benchmark speed-distribution model and 
the five-bin step- function speed-distribution hypothesis. Each panel represents a different benchmark m x . The lighter pair of 
contours represents 68% and 95% C.L. regions based on the analysis windows described in Sec. Ill CI and the darker pair of 
contours are the results if the analysis windows are extended to 1 MeV. 



cial Qmax for the experiments. In this figure, the 68% 
C.L. region for the lowest-speed bin is enormous com- 
pared to the other bins. The v = 100 — 200 km s _1 bin is 
well centered on its true value. Thus, the systematically 
low value of the speed distribution for the lowest-speed 
bin in the five-bin model is an artifact of the fact that the 
experiments cannot really constrain WIMPs that scatter 
below threshold. 

The lack of sensitivity to the lowest-speed WIMPs also 
explains several features of the m x — D parameter con- 
straints. For the m x = 50 and 100 GeV cases, both 
m x and D tend to be too low, with multiple peaks in 
the posterior for m x = 100 and 500 GeV in Fig. [§] (and 
for m x = 50 GeV in Fig. [7j. For the SuperCDMS- 
and LUX-inspired experiments, the threshold recoil en- 
ergy Qmax lies right on the transition between the lowest 
two speed bins for m x s» 80 GeV in the five-bin hypothe- 
sis. For lower-mass WIMPs, those experiments are com- 
pletely insensitive to the lowest-speed bin, and for higher 
mass WIMPs, the experiments are sensitive to speeds 
only at the upper edge of the lowest-speed bin. However, 
for the SHM + SDD model, a relatively large fraction 
of the WIMPs are actually in the lowest-speed bin. The 
multiple peaks in the posterior appear to be associated 
with this transition in sensitivity to the lowest-speed bin, 
especially since this particular empirical description of 
the speed distribution is discontinuous. The cross section 
is biased low for the following reason. Since the differ- 
ential event rate is highest near threshold, the constraint 
on the speed bin just above threshold is strong and is 
more influenced by the lower speed WIMPs in the bin. 
Since the event rate goes as ~ J g{v)vdv and the num- 
ber density of particles goes as ~ g(v)v 2 dv, the number 
of WIMPs in the second-lowest-speed bin is biased high 



while the number of WIMPs in the lowest-speed bin is 
biased low (also due to the fact that the experiments are 
sensitive only to speeds at the upper edge of the speed 
bin). The cross section must drop to compensate for the 
relatively high number of WIMPs inferred in the second- 
lowest-speed bin. 

Next, I consider the question of model selection. I cal- 
culate the Bayes factor, AIC, and BIC for the SHM + 
SDD data sets. I find that the Bayes factor indicates that 
the step-function model is a better fit for each of the six 
ensembles of mock data sets. The Bayes factor is most 
significant for m x — 50 and 100 GeV for Q max = 1 MeV. 
In those cases, ln(B) = —7 to —6 (Table HJ), which is con- 
sidered "strong evidence" on the Jeffreys' scale [g^- All 
ensembles of data sets with the exception of the single 
ensemble with m x = 500 GeV and the fiducial Q max in- 
dicate a lower AIC for the step-function model than the 
Maxwell-Boltzmann distribution. However, only those 
mock data sets corresponding to to x =50 or 100 GeV with 
Qmax= 1 MeV additionally have a lower BIC for the step- 
function model than for the Maxwell-Boltzmann distri- 
bution. Thus, while the Bayes factor and AIC generally 
show that the Maxwell-Boltzmann distribution is disfa- 
vored for this particular two-component velocity distri- 
bution, it is generally only moderately disfavored relative 
to the step-function model unless Q max is large. 

Last, I consider the SHM + SDD + 2 streams model. 
The probability contours in the m x —D plane for several 
values of m x are shown in Fig. 1121 and the speed distri- 
bution is shown in Fig. [T31 As for the SHM + SDD case, 
the contours in the m x —D plane are typically slightly off- 
set from the true point in parameter space, but are less 
offset for higher Q max . Also as for the SHM + DD case, 
the lowest-speed bin in is systematically low due to the 
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FIG. 10: Inferred WIMP speed distributions for the SHM + 
SDD benchmark model. The solid error bars denote the 68% 
C.L. region for each <?, using the fiducial analysis windows, 
and the dashed error bars show the same but for Q max = 1 
MeV. The upper panels show the speed constraints when the 
WIMP mass and cross section are sampled logarithmically, 
and the bottom panels show the speed constraints when there 
is an additional 10% Gaussian prior on the WIMP mass. The 
solid line denotes the benchmark speed distribution. 



poor constraints on the lowest-speed WIMPs, although 
the higher-speed bins are well centered on the true speed 
distribution. 

The model-selection patterns also follow those of the 
SHM + SDD benchmark speed distribution. There 
is only one exception to the patterns of the SHM + 
SDD findings. The case of m x = 100 GeV with fidu- 
cial Qmax has AIC and BIC that prefer the Maxwell- 
Boltzmann velocity distribution model (although in the 
case of the AIC, the difference between the two models is 
quite small, < 2), and the Bayes factor is m(i?) = —1.9, 
which indicates a weak-to-modcratc preference for the 
step-function model. Otherwise, the trends in \n(B), 
AIC, and BIC for the SHM + SDD input velocity model 
hold for this more complicated input velocity model, too. 
The Bayes' factors for all benchmarks are given in Table 
III 

There are a few final points I would like to address in 
this section. First, although I have shown for the mock 
data sets with multimodal velocity distributions that one 
may reasonably reconstruct a speed distribution (taking 
care with the low-speed end for which the experiments 
have little constraining power) using the step-function 
hypothesis, there is still the issue that the m x —D prob- 
ability distribution is offset from the true value. In fact, 
the true point lies outside of the 95% C.L. contour in 



FIG. 11: Speed distribution inferred from mock data for the 
SHM + SDD model and m x = 500 GeV. There are ten bins 
equally sized in v up to v — 1000 km s _1 . 



most cases I considered. This is because, even though 
the five-bin step-function speed-distribution hypothesis 
is a better fit to the data, it is by no means the best fit 
speed-distribution hypothesis for the data. In fact, the 
discontinuous nature of the speed-distribution hypothesis 
is clearly not physical and is responsible for some of the 
odder features of the probability contours, as discussed 
in this section. 

One may ask if one does better with a larger number of 
speed bins. Even though the speed-distribution hypoth- 
esis is still discontinuous, it is a better representation of 
a continuous function. I ran a set of tests in which I 
doubled the number of speed bins, from five to ten. The 
maximum likelihood £ max barely improved between the 
two sets of analyses (|Aln£ max | < 3), meaning that the 
AIC and BIC model-selection criteria would prefer the 
five-bin model over the ten-bin model. The only case in 
which £ max increased enough that the AIC preferred the 
10-bin model was for the SHM + SDD benchmark with 
m x = 100 GeV and Q max = 1 MeV. The Bayes factor 



Z(5 bins) 
Z(10 bins) 



(19) 



was ranged from nearly 1 (no preference in favor of either 
model) to ln(-Bbin) = 3, which indicates moderate pref- 
erence for the five-bin model. Even for the one case in 
which the AIC indicated the ten-bin model was a better 
fit, the Bayes factor indicated a preference for the five- 
bin model. Usually, the Bayes factor was less significant 
in distinguishing between the hypotheses of the num- 
ber of bins in the step-function speed-distribution model 
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FIG. 12: Marginalized probability distributions for m x and D for the SHM + SDD + 2 streams benchmark model and the 
five-bin step-function speed-distribution hypothesis. Each panel represents a different input mass. The lighter pair of contours 
represents 68% and 95% C.L. regions based on the analysis windows described in Sec. Ill CI and the darker pair of contours are 
the results if the analysis windows are extended to 1 MeV. 



than distinguishing between the five-bin model and the 
Maxwcll-Boltzmann distribution. It appears that for the 
mock data sets I considered (and what is likely to hold 
true for the first years of real data), ten speed bins is 
likely overkill. The one interesting feature of the ten-bin 
model was that the m x ~D probability distributions were 
better centered on the true value than for the five-bin 
model, although the posterior is still multimodal. This is 
illustrated in Fig. 1141 in which I show the probability dis- 
tributions for the SHM + SDD benchmark for m x = 100 
GeV. However, this needs to be explored for more ensem- 
bles of mock data sets to see if that is generally true. 

Although I have shown that one may achieve demon- 
strably better fits to multimodal velocity distributions 
using the step-function speed-distribution hypothesis, I 
have not shown that this is truly the best empirical 
speed-distribution model one could use. In fact, the dis- 
continuous nature of this empirical function has clear 
downsides, in addition to only being able to approximate 
theoretically-inspired functional forms for the speed dis- 
tribution in the limit of many bins. In practice, it is 
likely that direct-detection data sets will need to be ana- 
lyzed with a variety of empirical hypotheses for the speed 
distribution in order to achieve the best, unbiased con- 
straints on both the particle-physics parameters (m x , D) 
and the speed distribution. I leave the development of a 
strategy for optimal model selection to future work. 

The key points of this section are that one may ob- 
tain a reasonable estimate of the WIMP speed distribu- 
tion using these simple step-function speed-distribution 
models, and that one may distinguish between speed- 
distribution models using Bayesian model-selection cri- 
teria. In particular, the simple step-function model is 
moderately to strongly preferred over the single- velocity- 



component model for the benchmark multimodal velocity 
distributions. 



IV. DISCUSSION 

In this work, I studied the prospects of inferring the 
WIMP speed distribution in addition to the WIMP 
particle-physics parameters for several benchmark mod- 
els from direct-detection data sets. I created mock data 
sets for idealized versions of cryogenic and liquid-noble 
experiments expected to be on line by or near 2015. I 
applied Bayesian inference to estimate WIMP parameter 
values and uncertainties from the mock data sets. 

There were three cases I considered. In the first case, 
I considered constraints on WIMP particle-physics and 
speed-distribution parameters in the case that hypoth- 
esis for the speed distribution matched the actual form 
of the speed distribution, but for which the parameters 
of the speed distribution were otherwise unconstrained. 
The motivation for this study was twofold. First, most 
parameter forecasts for direct-detection experiments have 
focused on the hypothesis of a smooth halo WIMP DF, 
with the speed parameters v\ ag and v lms fixed to some- 
thing like the SHM If the speed-distribution 
parameters were allowed to vary at all, it was typically 
not over a wide range. Thus, I wanted to explore a num- 
ber of benchmark halo DF scenarios with weak priors on 
the parameters to see how well one could infer both the 
speed-distribution and WIMP particle-physics parame- 
ters. Although Sec. IIII Al focused on benchmark speed- 
distribution models that spanned a reasonable range for 
a smooth halo hypothesis, I have also considered other 
single-mode models (e.g., if the dark disk or a single large 
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FIG. 13: Inferred WIMP speed distributions for the SHM 
+ SDD + 2 streams benchmark model. The solid error bars 
denote the 68% C.L. region for each gi using the fiducial anal- 
ysis windows, and the dashed error bars show the same but 
for Q m ax = 1 MeV. The upper panels show the speed con- 
straints when the WIMP mass and cross section are sampled 
logarithmically, and the bottom panels show the speed con- 
straints when there is an additional 10% Gaussian prior on 
the WIMP mass. The solid line denotes the benchmark speed 
distribution. 



velocity stream dominates the local DF) and found sim- 
ilar results. Second, the best parameter constraints are 
obtained when the hypothesis for the speed-distribution 
model is correct, and I wanted to know how good those 
best constraints are likely to be for a variety of bench- 
mark speed-distribution and particle-physics parameter 
sets. 

I found that for any benchmark speed-distribution 
model, that one could get reasonable constraints on 
m x and D with only the weakest of priors on the the- 
oretical parameters, and that constraints improved sig- 
nificantly if Qmax was set quite high. Constraints were 
tightest for m x < 100 GeV and for vi ag < 280 km s _1 . In 
general, the degree of uncertainty in m x and D depends 
on the underlying speed distribution. 

The constraints on «i ag and v Ilas depend on the un- 
derlying speed distribution as well as m x and D. Con- 
straints were generally tighter for smaller m x and v lms . 
In general, u rms is far better constrained than Di ag ; it is 
only possible to place an upper limit on v\ ag . This was 
true even when I introduced a strong prior on the WIMP 
mass, a prior of the sort one would expect if supersym- 
metry were discovered at the LHC. This prior sharpened 
constraints on v rms but not ^i ag unless to x > 100 GeV. 
The somewhat sobering conclusion is that while it is pos- 
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FIG. 14: Marginalized probability distributions for m x and 
D for m x = 100 GeV and SHM + SDD benchmark model and 
the ten-bin step-function speed-distribution hypothesis. The 
lighter pair of contours represents 68% and 95% C.L. regions 
based on the analysis windows described in Sec. Ill CI and the 
darker pair of contours are the results if the analysis windows 
are extended to 1 MeV. 



sible to get good constraints on the velocity dispersion of 
WIMPs, it will be significantly harder to determine the 
typical speed of WIMPs with respect to the Earth with 
2015-era experiments. 

Second, in Sec. IIIIBI I considered what constraints 
one would obtain under the hypothesis of a Maxwell- 
Boltzmann distribution in the case that the true DF were 
multimodal. The motivation for studying this case is the 
strong theoretical prior on the form of the WIMP DF 
that usually goes into direct-detection parameter fore- 
casts. For both the SHM + SDD and SHM + SDD + 
2 streams multimodal DF models, I found that m x and 
D were biased low by ~ 50%, mostly due to the dark- 
disk component, something that was also found by Ref. 
[44| . Although these offsets are not enormous for these 
particular WIMP DF models, they could be more severe 
for other models. Moreover, we lose information about 
the nature of the WIMP speed distribution, such as its 
multimodal character, if we restrict ourselves to a single- 
velocity-component hypothesis. 

Finally, in Sec. IIII CI I considered a simple empir- 
ical model for the speed distribution, both to get a 
sense of how well one may recover WIMP particle-physics 
and speed-distribution properties as well as its use for 
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Bayesian model selection. I showed that the five-bin 
step-function speed-distribution model successfully re- 
produced the WIMP speed distribution for speeds above 
v = 200 km s _1 , and that the bias in the v — — 
200 km s _1 bin was due to the experiments' lack of sen- 
sitivity to WIMPs with v < 100 km s" 1 . For all bench- 
mark speed-distribution models I considered, the inferred 
values of m x and D were biased relative to their true 
values, but no more so than with the single Maxwell- 
Boltzmann hypothesis used in Sec. MI Bl However, the 
biases were less severe if the analysis window was ex- 
tended to higher energies. Moreover, there were hints 
from the ten-bin step-function model that the bias was 
less than for the five-bin model, even though the Occam's 
razor philosophy of the Bayesian model-selection criteria 
I considered favored the five-bin model. The bias should 
decrease as better empirical speed-distribution hypothe- 
ses are found. 

I found that Bayesian model-selection criteria were 
largely successful in ranking hypotheses for the speed 
distribution. For the SHM benchmarks, the Bayesian 
model-selection criteria indicated that the Maxwell- 
Boltzmann hypothesis was a better fit to the data than 
the step-function hypothesis, although the significance of 
the preference was not strong. For the multimodal bench- 
marks, these model-selection criteria showed moderate 
to strong preference for the step-function model over the 
Maxwcll-Boltzmann model. Moreover, they showed that 
doubling the number of bins in the step-function model 
did not improve the fit over the five-bin model enough 
to justify the additional bins, except for the one case 
mentioned at the end of Sec. MI CI These findings are 
interesting for several reasons. First, they show that it is 
possible to get reasonable constraints on the speed distri- 
bution with few-parameter empirical speed-distribution 
hypotheses. This is good because we do not really know 
what to expect for the WIMP speed distribution. Sec- 
ond, they show that it is possible to rule in or out popular 
theoretical models for the WIMP DF. 



A. Outstanding issues 

Although I have shown that it is possible to distinguish 
between Maxwcll-Boltzmann and step-function WIMP 
speed-distribution hypotheses with a modest amount of 
direct-detection data, there are still a number of ques- 
tions regarding model selection for the speed distribu- 
tion. 

First, although I have shown that the five-bin step- 
function speed-distribution hypothesis yields reasonable 
constraints on the speed distribution and only moder- 
ately biased constraints on m x and D, I have not shown 
that it is the best hypothesis for the speed distribution. 
In fact, it cannot be the best hypothesis, since the best 
hypothesis would be that which matched the form of the 
benchmark speed distributions. Moreover, its discontin- 
uous nature is both nonphysical and creates strange fea- 



tures in the posterior, such as the multimodality in the 
m x — D constraints in Figs. [7]to[T2j However, since we 
do not really know what to expect for the WIMP speed 
distribution, it is best to explore a variety of empirical 
models. In this work, I considered equal-sized (in speed) 
step functions to model the speed distribution, but it is 
possible that allowing the widths of the step function to 
vary or choosing smoother, continuous basis functions is 
better. For example, in Sec. MI CI I showed that g(v) 
below v ~ 100 km s _1 is not well constrained due to the 
thresholds of the experiments, but that bins of width 
100 km s _1 were too small. Perhaps a better strategy 
would be to have a single bin for speeds for which the 
typical recoil energy lies below threshold, and other-sized 
bins for higher speeds, or to expand the speed distribu- 



tion into a set of orthogonal functions. Reference [57 1 
uses overlapping step-function bins of various sizes. The 
question is, what is the best strategy to search through 
these possibilities? This will depend on the true values 
of the WIMP particle-physics parameters and the speed 
distribution, but it is worth putting some thought into 
how to find empirical hypotheses that will maximize our 
return on the data. 

Finding a good hypothesis for the speed distribution is 
important not just for the sake of constraining the speed 
distribution, but a better hypothesis for the speed distri- 
bution should also lead to less biased inferences for the 
particle-physics properties of WIMPs. As demonstrated 
in Sec. MI CI even though the step-function hypothesis 
leads to good constraints on the speed distribution, the 
inferred values of the particle-physics parameters are off- 
set from their true values. I will delve into this topic 
more in Sec. IIVBI 

However, there are also several theoretical models for 
the speed distribution that would be interesting to test 
using direct-detection data. So far, I have only consid- 
ered Maxwell-Boltzmann theoretical models, which are 
based on arguments along the lines of those found in 
Sec. M Bl and Appendix A of Ref. [7lJ . There are other 
theoretical models for the local s pee d distribution that 
are based on N-body simulations (47j. in particular, for 
the ansatz that the speed distribution is dominated by a 
smooth halo component. It would be interesting to use 
the best empirical models for hypothesis testing against 
a wider class of specific theoretical models. For example, 
if none of the smooth halo predictions fit better than the 
best empirical fit, then this might suggest that the WIMP 
speed distribution is multimodal or that the velocity dis- 
tribution is anisotropic. Multimodal speed distributions 
are a signature of the Milky Way's accretion history, and 
it would be interesting to determine how much one could 
learn about the accretion history based on the direct- 
detection data. 

An interesting question is if one may infer the escape 
speed of WIMPs from the Milky Way. Currently, the best 
constraints on the local escape speed come from measure- 
ments of the radial velocities of local high- velocity stars 
with the RAVE survey [97}. The 90% confidence limits 
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for the escape speed are 498 km s _1 < v csc < 608 km s _1 
and are somewhat model dependent. While there is likely 
a population of WIMPs passing through the Solar Sys- 
tem that are unbound to the Galaxy due to the fact that 
the Galaxy is still accreting matter, it is probably small 
since the Sun sits deep inside the halo. If there is a sharp 
drop off in the distribution function at the escape speed, 
this should leave an imprint in the nuclear recoil spec- 
tra. The question is if this imprint is large enough to 
discern even for optimistic WIMP particle-physics and 
speed-distribution scenarios. A further complication is 
that the types of experiments I considered in this work 
are not sensitive to direction, so there will be some un- 
certainty in the mapping of the geocentric speeds to a 
Galactocentric reference frame. However, this is an in- 
teresting question but I defer a study thereof to future 
work. 

There are a number of more "practical" issues I have 
not addressed yet. First, I have ignored backgrounds, 
energy uncertainties, and systematics. Obviously, these 
experimental realities will affect parameter estimation. 
Furthermore, I have also ignored several theoretical is- 
sues beyond the ansatz of a WIMP model for dark mat- 
ter. For example, I made the ansatz that ap D ^a^ D = 0, 
which is almost certainly not the case in reality. There 
are also uncertainties on the form factor Fsi- The un- 
certainties are even greater for spin-dependent scattering 
[ill Il09| . One way forward is to parametrize all the un- 
certainties, backgrounds, and systematics; throw them all 
into a likelihood function; and search a greatly expanded 
parameter space with the use of the types of Bayesian 
tools I used in this work. 



B. Complementarity with other data sets 

Beyond finding a good model for the speed distribution 
for its own sake, it is useful to characterize the speed dis- 
tribution well for the purpose of WIMP identification. 
Once WIMPs are discovered through multiple channels 
(e.g., produced at the Large Hadron Collider, inferred 
from the observed shower of particles from WIMP anni- 
hilation), one will want to see if the particles discovered 
through these channels are actually of the same type. 
In addition, if the same WIMP particle is responsible 
for all these signals, one will want to characterize the 
WIMP and the theory to which it belongs using the data. 
There have been several studies to investigate how data 
from the LHC and direct-detection experiments can be 
used to constrain specific theories for physics b eyond th e 
standard model, especially supersymmetry [99llll0l4ll2 |. 
Given the high complexity of the parameter spaces of 
these theories, a wide network of points in the param- 
eter space can yield the same set of LHC observables. 
This can lead to estimates of the relic density and spin- 
independent cross sections that span orders of magni- 
tude. It is especially important to be able to estimate 
the relic density for a theory given the data because the 



one thing we know about dark matter to a high degree 
of accura cy is the its abundance in the Universe (e.g., 
[I13llll4j ). If a theory that fits the data reasonably well 
results in a too-large relic abundance of dark matter, then 
it is ruled out. If the theory predicts a relic abundance 
that is significantly below the true relic abundance, it in- 
dicates that the dark matter created in the collider is only 
a subdominant compon ent o f dark matter as a whole. 

References [99| and [lll| have shown that the addi- 
tion of direct-detection data can significantly improve 
constraints on the estimated relic abundance given the 
collider data and a specific theory. However, these au- 
thors fixed the WIMP distribution function in their anal- 
yses. As I have shown in Sec. IIII1 a poor hypoth- 
esis for the speed distribution will result in biases in 
m x and D. Even in Sec. 1111 Cl in which the step- function 
speed-distribution hypothesis was a significantly better 
fit to the multimodal distribution-function data than the 
Maxwcll-Boltzmann hypothesis, it still lead to biases in 
the WIMP particle-physics parameters. If these speed- 
distribution-dependent biases are not fully understood, 
they could lead to incorrect inferences about the WIMP 
particle model. 

In order to facilitate accurate joint analyses among col- 
lider, direct-detection, and indirect-detection data sets, 
I recommend performing parameter inference for a va- 
riety of speed-distribution hypotheses, including several 
empirical models. This will give us a sense of how the 
uncertainty in the speed-distribution model affects, for 
example, the inferred relic density for a specific particle 
model. 



V. CONCLUSION 

In this work, I have created and analyzed mock data 
sets for idealized 2015-era liquid-noble and cryogenic 
direct-detection experiments. The main point of this 
work was to explore how well one might reconstruct the 
WIMP speed distribution in addition to the particle- 
physics properties of WIMPs (mass, cross sections) from 
future data sets using Bayesian inference. The main find- 
ings are the following: 

• Regardless of the true WIMP distribution function 
or of the hypothesis for the form of the WIMP dis- 
tribution function, it is unnecessary at best and 
misleading at worst to place strong priors on the pa- 
rameters of the distribution-function model. Even 
using extremely weak priors on the WIMP mass, 
cross sections, and distribution-function parame- 
ters, one is able to get good constraints on all pa- 
rameters for data sets as small as 200 events for all 
experiments combined. If, however, there are sig- 
nificantly fewer events, parameter constraints will 
be prior dominated if the prior is strong. 

• The constraints improve significantly if the analysis 
windows for the direct-detection experiments are 
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extended as high in energy as possible. 

• Empirical speed-distribution hypotheses lead to 
good reconstruction of the WIMP speed distribu- 
tion. I recommend further investigation into their 
use for direct-detection parameter inference. 

• Even for the modest mock data sets in this work, 
it is possible to use Bayesian model-selection crite- 
ria determine if a specific functional form for the 
distribution function fits better than a simple em- 
pirical speed-distribution model. This is especially 
useful for determining if the local WIMP popula- 
tion is dominated by an equilibrium dark-matter 
halo population or bears significant imprints of the 



Milky Way's accretion history. 
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